feat: Add warmup functionality to reduce search latency #195
base: main
- Add enable_warmup parameter to HNSW and DiskANN embedding servers
- Implement warmup() method on LeannSearcher for manual pre-warming
- Auto-warmup option during LeannSearcher initialization (enable_warmup=True)
- Pre-load embedding model at server startup to avoid cold-start latency
- Add comprehensive tests for warmup functionality

Fixes yichuan-w#177 (search recompute latency)
Fixes yichuan-w#159 (warmup strategy)
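For readers unfamiliar with the pattern, here is a minimal sketch of how lazy loading, a manual `warmup()` call, and an `enable_warmup` flag can fit together. `DemoSearcher` and `fake_loader` are hypothetical stand-ins written for illustration; this is not the LeannSearcher implementation from this PR.

```python
import time


class DemoSearcher:
    """Hypothetical stand-in for a searcher with a lazily loaded model."""

    def __init__(self, load_model, enable_warmup=False):
        self._load_model = load_model  # callable that builds the model
        self._model = None             # nothing is loaded yet
        if enable_warmup:
            self.warmup()              # pay the load cost up front

    def warmup(self):
        """Load the model if needed, run one dummy call, return seconds spent."""
        start = time.perf_counter()
        if self._model is None:
            self._model = self._load_model()
        self._model("warmup")          # dummy call to trigger any lazy init
        return time.perf_counter() - start

    def search(self, query):
        if self._model is None:        # cold path: first search pays the load cost
            self._model = self._load_model()
        return self._model(query)


load_calls = []


def fake_loader():
    """Pretend model loader; records each call so load timing is visible."""
    load_calls.append(1)
    return lambda text: [float(len(text))]


cold = DemoSearcher(fake_loader)                      # model loads on first search
warm = DemoSearcher(fake_loader, enable_warmup=True)  # model loads in __init__
```

With the flag set, the loading cost moves from the first `search()` call to construction time; calling `warmup()` manually gives the same effect at a moment the caller chooses.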
|
Thanks, this has been a known issue for a long time, and we will look into it! cc @andylizf. Can you fix the lint error here? |
- Remove unused imports (tempfile, Path, MagicMock)
- Fix import order (stdlib before third-party)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
|
The macOS-13 CI jobs show as 'cancelled' rather than 'failed' - this appears to be a GitHub Actions runner issue, not a code problem. All other builds (macos-14, macos-15, ubuntu) passed successfully. Could you please re-run the cancelled macOS-13 jobs? |
|
Sure, I will do that later. Sorry for the late response; I was on vacation. And thanks again for your contribution! |
|
No worries, thanks for re-running the CI! Let me know if there's anything else that needs to be addressed. |
|
Thanks for implementing this warmup feature! The functionality looks good and solves a real latency problem. A few suggestions for potential future improvements (not blocking for this PR):

1. Extract common warmup logic

The warmup code in

    # leann/warmup.py
    def warmup_embedding_model(model_name: str, embedding_mode: str, provider_options=None) -> float:
        """Pre-load embedding model by computing a dummy embedding."""
        ...

2. Clarify
|
|
What do you think? @yichuan-w |
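As a rough illustration of the shared-helper idea in suggestion 1, the sketch below keeps the proposed signature and adds a hypothetical `compute_fn` argument so it can run without any real embedding backend. It assumes the `float` return value means elapsed seconds, which the thread does not state; `fake_compute` and the model and mode names are made up for the example.

```python
import time
from typing import Callable, List, Optional


def warmup_embedding_model(
    model_name: str,
    embedding_mode: str,
    provider_options: Optional[dict] = None,
    *,
    compute_fn: Callable[..., List[List[float]]],  # hypothetical injection point
) -> float:
    """Pre-load the embedding model by computing one dummy embedding.

    Returns elapsed wall-clock seconds (an assumed meaning of the float).
    """
    start = time.perf_counter()
    compute_fn(["warmup"], model_name, embedding_mode, provider_options)
    return time.perf_counter() - start


# Usage with a fake backend that only records what it was asked to embed.
seen = []


def fake_compute(texts, model_name, embedding_mode, provider_options):
    seen.append((tuple(texts), model_name, embedding_mode))
    return [[0.0, 0.0, 0.0] for _ in texts]


elapsed = warmup_embedding_model("demo-model", "demo-mode", compute_fn=fake_compute)
```

Both embedding servers could then call one function like this at startup instead of each carrying its own copy of the warmup code.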
Summary
- Add enable_warmup parameter to HNSW and DiskANN embedding servers to pre-load model at startup
- Implement warmup() method on LeannSearcher for manual pre-warming before first search
- Auto-warmup option during LeannSearcher initialization (enable_warmup=True)

Test Plan
- tests/test_warmup.py

Fixes #177
Fixes #159